In Section 2.1, we showed how to download the FASTA file of the reference genome
sequence of an organism and how to index it using “samtools faidx”. If you have not done
so, follow the steps in that section to download the human reference genome and index
it. Reference genome sequences can also be downloaded from other databases, such
as the UCSC database. We have also downloaded and compressed example paired-end FASTQ
files for practice. The next step is to show how to use an aligner (BWA, Bowtie, or
STAR) for read mapping.
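Before running any aligner, it can help to confirm that these inputs are in place. The following is a minimal sketch, not part of the original workflow: the function name check_mapping_inputs and the file names in the usage note are hypothetical. It relies only on the fact that “samtools faidx” writes its index next to the FASTA file with an added “.fai” extension.

```shell
# Hypothetical helper (not from the text): report which read-mapping inputs
# are missing or empty before an aligner is run.
# Usage: check_mapping_inputs REF.fa READS_1.fastq.gz READS_2.fastq.gz
check_mapping_inputs() {
    cmi_ref="$1"
    shift
    cmi_missing=0
    # "samtools faidx REF.fa" creates REF.fa.fai beside the FASTA file,
    # so the index is checked together with the reference and the reads.
    for cmi_file in "$cmi_ref" "$cmi_ref.fai" "$@"; do
        if [ ! -s "$cmi_file" ]; then
            echo "missing or empty: $cmi_file" >&2
            cmi_missing=$((cmi_missing + 1))
        fi
    done
    # Succeed only when every input file exists and is non-empty.
    [ "$cmi_missing" -eq 0 ]
}
```

For example, assuming the reference was saved as genome.fa and the reads as reads_1.fastq.gz and reads_2.fastq.gz (placeholder names; substitute your own), calling “check_mapping_inputs genome.fa reads_1.fastq.gz reads_2.fastq.gz” lists any file that still needs to be downloaded or indexed and returns a non-zero status if something is missing.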
2.3.2.1 Burrows–Wheeler Aligner
The Burrows–Wheeler Aligner (BWA) is a sequence aligner that uses the Burrows–Wheeler
transform (BWT) and the FM-index. We can install the latest version of BWA by following
the installation instructions at “https://github.com/lh3/bwa” or by running the following commands:
git clone https://github.com/lh3/bwa.git
cd bwa; make
The commands above clone the BWA source files into a “bwa” directory inside your working
directory and then compile them. Once BWA has been installed successfully, you may need to set its path so that
FIGURE 2.17 The per base quality reports for the reads in the two FASTQ files.